AITopics | truth data

truth data

A Bayesian latent Gaussian process framework for aerodynamic uncertainty quantification

Davis, Geoffrey, Renganathan, Ashwin

arXiv.org Machine Learning, Jun-30-2026

Predicting the aerodynamic performance (e.g. lift, drag, and moment coefficients) of an aircraft is challenging: computational models are biased and direct simulations are prohibitive. A pragmatic way to overcome this limitation is to calibrate low-fidelity computational predictions with experimental measurements. This, however, requires calibrating against sparse measurements contaminated with uncertainty in both the control inputs and the measured aerodynamic response. We develop a methodology to address this problem based on Gaussian process surrogates and the classical Kennedy-O'Hagan calibration. A surrogate model learned on abundant-but-cheap low-fidelity data is calibrated with a sparse set of measurement data. Crucially, we develop a Bayesian latent Gaussian process based approach that marginalizes the calibrated surrogate model over the input uncertainty, while also matching the marginal mean and variance of the measured output uncertainty. Once calibrated, our surrogate model predicts the uncertainty in aerodynamic coefficients with very high accuracy, including at extrapolative input settings. We validate our calibrated surrogate model predictions against measurement data with true uncertainty intervals to demonstrate that the model places $94.2-95.8\%$ of its predictive samples inside the released $95\%$ truth intervals, with endpoint cumulative probabilities very close to the nominal 0.025 and 0.975 levels.
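The validation described in this abstract (the share of predictive samples falling inside a 95% truth interval, plus the cumulative probabilities at the interval endpoints) can be illustrated with a minimal sketch. It is not the authors' code: the function name `coverage_and_endpoint_probs` is hypothetical, and standard-normal draws stand in for the surrogate's predictive samples.

```python
import random


def coverage_and_endpoint_probs(samples, lower, upper):
    """Return (coverage, P[s < lower], P[s <= upper]) for one truth interval.

    Assumes lower <= upper. All three values are empirical frequencies
    computed from the predictive samples.
    """
    n = len(samples)
    below = sum(1 for s in samples if s < lower)
    at_or_below_upper = sum(1 for s in samples if s <= upper)
    inside = at_or_below_upper - below  # samples with lower <= s <= upper
    return inside / n, below / n, at_or_below_upper / n


# Assumption: synthetic standard-normal draws replace real predictive samples,
# and the "truth" interval is the central 95% interval of that distribution.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
lower, upper = -1.959964, 1.959964

coverage, p_lower, p_upper = coverage_and_endpoint_probs(samples, lower, upper)
print(f"coverage={coverage:.3f}  P(lower)={p_lower:.3f}  P(upper)={p_upper:.3f}")
```

For a well-calibrated model the three numbers should sit near 0.95, 0.025, and 0.975 respectively, which is the comparison the abstract reports.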

artificial intelligence, calibration, machine learning, (17 more...)

2606.28871

Country: North America > United States > Pennsylvania (0.50)

Genre: Research Report (0.82)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Variational Autoencoder for Calibration: A New Approach

Barrett, Travis, Mishra, Amit Kumar, Mwangama, Joyce

arXiv.org Artificial Intelligence, Nov-4-2025

In this paper we present a new implementation of a Variational Autoencoder (VAE) for the calibration of sensors. We propose that the VAE can be used to calibrate sensor data by training the latent space as a calibration output. We discuss this new approach and show a proof-of-concept using an existing multi-sensor gas dataset. We report the performance of the proposed calibration VAE and find that it can act as a calibration model and as an autoencoder simultaneously. Additionally, these models produce calibration and reconstruction outputs that are statistically similar to their respective truth data. We then discuss the methods of future testing and planned expansion of this work.

artificial intelligence, calibration, machine learning, (16 more...)

doi: 10.1109/I2MTC62753.2025.11078954

2511.00475

Country:

Europe (0.28)
Africa > South Africa (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

SPICE-HL3: Single-Photon, Inertial, and Stereo Camera dataset for Exploration of High-Latitude Lunar Landscapes

Rodríguez-Martínez, David, van der Meer, Dave, Song, Junlin, Bera, Abishek, Pérez-del-Pulgar, C. J., Olivares-Mendez, Miguel Angel

arXiv.org Artificial Intelligence, Jul-1-2025

Exploring high-latitude lunar regions presents an extremely challenging visual environment for robots. The low sunlight elevation angle and minimal light scattering result in a visual field dominated by a high dynamic range featuring long, dynamic shadows. Reproducing these conditions on Earth requires sophisticated simulators and specialized facilities. We introduce a unique dataset recorded at the LunaLab from the SnT - University of Luxembourg, an indoor test facility designed to replicate the optical characteristics of multiple lunar latitudes. Our dataset includes images, inertial measurements, and wheel odometry data from robots navigating seven distinct trajectories under multiple illumination scenarios, simulating high-latitude lunar conditions from dawn to night time with and without the aid of headlights, resulting in 88 distinct sequences containing a total of 1.3M images. Data was captured using a stereo RGB-inertial sensor, a monocular monochrome camera, and for the first time, a novel single-photon avalanche diode (SPAD) camera. We recorded both static and dynamic image sequences, with robots navigating at slow (5 cm/s) and fast (50 cm/s) speeds. All data is calibrated, synchronized, and timestamped, providing a valuable resource for validating perception tasks from vision-based autonomous navigation to scientific imaging for future lunar missions targeting high-latitude regions or those intended for robots operating across perceptually degraded environments. The dataset can be downloaded from https://zenodo.org/records/13970078?preview=1, and a visual overview is available at https://youtu.be/d7sPeO50_2I. All supplementary material can be found at https://github.com/spaceuma/spice-hl3.

artificial intelligence, dataset, trajectory, (18 more...)

2506.22956

Country:

North America > United States (0.14)
Europe > Switzerland (0.04)
Europe > Spain > Andalusia > Málaga Province > Málaga (0.04)
Asia > India (0.04)

Genre: Research Report (0.64)

Industry:

Government > Space Agency (0.48)
Energy (0.46)
Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)

Statistical Study of Sensor Data and Investigation of ML-based Calibration Algorithms for Inexpensive Sensor Modules: Experiments from Cape Point

Barrett, Travis, Mishra, Amit Kumar

arXiv.org Artificial Intelligence, Mar-9-2025

In this paper we present a statistical analysis of data from inexpensive sensors. We also present the performance of machine learning algorithms when used for automatic calibration of such sensors. In this work we used a low-cost Non-Dispersive Infrared CO$_2$ sensor placed at a co-located site at Cape Point, South Africa (maintained by Weather South Africa). The collected low-cost sensor data and site truth data are investigated and compared. We compare and investigate the performance of Random Forest Regression, Support Vector Regression, 1D Convolutional Neural Network and 1D-CNN Long Short-Term Memory Network models as methods for automatic calibration, along with the statistical properties of these model predictions. In addition, we investigate the drift in performance of these algorithms with time.
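One simple way to look at drift in calibration performance over time is to score the calibrated output against the truth data in consecutive time windows. The sketch below is an illustration only, not the authors' procedure: `windowed_rmse` is a hypothetical helper, and the CO$_2$ values are made up.

```python
import math


def windowed_rmse(predicted, truth, window):
    """RMSE of predicted vs. truth over consecutive, non-overlapping windows.

    A trailing partial window (fewer than `window` samples) is ignored.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    scores = []
    for start in range(0, len(truth) - window + 1, window):
        squared_errors = [
            (p - t) ** 2
            for p, t in zip(predicted[start:start + window],
                            truth[start:start + window])
        ]
        scores.append(math.sqrt(sum(squared_errors) / window))
    return scores


# Assumption: a constant 410 ppm reference and a calibrated output whose bias
# grows by 0.5 ppm per sample, standing in for real co-located measurements.
truth = [410.0] * 12
predicted = [t + 0.5 * i for i, t in enumerate(truth)]

drift_scores = windowed_rmse(predicted, truth, window=4)
print(drift_scores)  # one RMSE per window; rising values indicate drift
```

The same windowed score can be computed for each candidate model to compare how quickly their calibration degrades.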

data set 1, sensor, truth data, (13 more...)

doi: 10.1109/TIM.2024.3372211

2503.13487

Country:

North America > United States (0.14)
Africa > Malawi (0.14)
Africa > South Africa > Western Cape > Cape Town (0.05)
Europe > Sweden (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)

A Novel Score-CAM based Denoiser for Spectrographic Signature Extraction without Ground Truth

Elias, Noel

arXiv.org Artificial Intelligence, Oct-29-2024

Sonar-based audio classification techniques are a growing area of research in the field of underwater acoustics. Usually, underwater noise picked up by passive sonar transducers contains all types of signals that travel through the ocean and is transformed into spectrographic images. As a result, the corresponding spectrograms intended to display the temporal-frequency data of a certain object often include the tonal regions of abundant extraneous noise that can effectively interfere with a 'contact'. So, a majority of spectrographic samples extracted from underwater audio signals are rendered unusable due to their clutter and lack the required distinguishability between different objects. With limited clean true data for supervised training, creating classification models for these audio signals is severely bottlenecked. This paper derives several new techniques to combat this problem by developing a novel Score-CAM based denoiser to extract an object's signature from noisy spectrographic data without being given any ground truth data. In particular, this paper proposes a novel generative adversarial network architecture for learning and producing spectrographic training data in similar distributions to low-feature spectrogram inputs. In addition, this paper also proposes a generalizable class activation mapping based denoiser for different distributions of acoustic data, even real-world data distributions. Utilizing these novel architectures and proposed denoising techniques, these experiments demonstrate state-of-the-art noise reduction accuracy and higher classification accuracy than current audio classification standards. As such, this approach has applications not only to audio data but also to countless data distributions used all around the world for machine learning.

signature, spectrogram, target class, (17 more...)

doi: 10.1109/IJCNN54540.2023.10191897

2410.21557

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

ILLUME: Rationalizing Vision-Language Models through Human Interactions

Brack, Manuel, Schramowski, Patrick, Deiseroth, Björn, Kersting, Kristian

arXiv.org Artificial Intelligence, May-31-2023

Bootstrapping from pre-trained language models has been proven to be an efficient approach for building vision-language models (VLM) for tasks such as image captioning or visual question answering. However, outputs of these models rarely align with users' rationales for specific answers. In order to improve this alignment and reinforce commonsense reasons, we propose a tuning paradigm based on human interactions with machine-generated data. Our ILLUME executes the following loop: Given an image-question-answer prompt, the VLM samples multiple candidate rationales, and a human critic provides feedback via preference selection, used for fine-tuning. This loop increases the training data and gradually carves out the VLM's rationalization capabilities that are aligned with human intent. Our exhaustive experiments demonstrate that ILLUME is competitive with standard supervised finetuning while using significantly less training data and only requiring minimal feedback.

explanation, large language model, machine learning, (20 more...)

2208.08241

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering

Maddikunta, Nikhil, Zhao, Huijun, Keswani, Sumit, Samuel, Alfy, Guo, Fu-Ming, Srishankar, Nishan, Pardeshi, Vishwa, Huang, Austin

arXiv.org Artificial Intelligence, Dec-16-2021

In the past, computer vision systems for digitized documents could rely on systematically captured, high-quality scans. Today, transactions involving digital documents are more likely to start as mobile phone photo uploads taken by non-professionals. As such, computer vision for document automation must now account for documents captured in natural scene contexts. An additional challenge is that task objectives for document processing can be highly use-case specific, which limits the utility of publicly available datasets, while manual data labeling is costly and translates poorly between use cases. To address these issues we created Sim2Real Docs, a framework for synthesizing datasets and performing domain randomization of documents in natural scenes. Sim2Real Docs enables programmatic 3D rendering of documents using Blender, an open source tool for 3D modeling and ray-traced rendering. By using rendering that simulates physical interactions of light, geometry, camera, and background, we synthesize datasets of documents in a natural scene context. Each render is paired with use-case specific ground truth data specifying latent characteristics of interest, producing unlimited fit-for-task training data. The role of machine learning models is then to solve the inverse problem posed by the rendering pipeline. Such models can be further iterated upon with real-world data by either fine tuning or making adjustments to domain randomization parameters.

document processing, randomization, recognition, (16 more...)

2112.0922

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.48)

Computer vision in AI: The data needed to succeed

#artificialintelligence, Apr-30-2021, 04:20:34 GMT

Developing the capacity to annotate massive volumes of data while maintaining quality is a function of the model development lifecycle that enterprises often underestimate. It's resource intensive and requires specialized expertise. At the heart of any successful machine learning/artificial intelligence (ML/AI) initiative is a commitment to high-quality training data and a pathway to quality data that is proven and well-defined. Without this quality data pipeline, the initiative is doomed to fail. Computer vision or data science teams often turn to external partners to develop their data training pipeline, and these partnerships drive model performance.

computer vision team, external partner, pipeline, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (0.71)
Information Technology > Artificial Intelligence > Machine Learning (0.65)

Automatic trajectory recognition in Active Target Time Projection Chambers data by means of hierarchical clustering

Dalitz, Christoph, Ayyad, Yassid, Wilberg, Jens, Aymans, Lukas, Bazin, Daniel, Mittig, Wolfgang

arXiv.org Machine Learning, Jul-10-2018

The automatic reconstruction of three-dimensional particle tracks from Active Target Time Projection Chambers data can be a challenging task, especially in the presence of noise. In this article, we propose a nonparametric algorithm that is based on the idea of clustering point triplets instead of the original points. We define an appropriate distance measure on point triplets and then apply a single-link hierarchical clustering on the triplets. Compared to parametric approaches like RANSAC or the Hough transform, the new algorithm has the advantage of potentially finding trajectories even of shapes that are not known beforehand. This feature is particularly important in low-energy nuclear physics experiments with AT operating inside a magnetic field. The algorithm has been validated using data from experiments performed with the Active Target Time Projection Chamber (AT-TPC) at the National Superconducting Cyclotron Laboratory (NSCL). The results demonstrate the capability of the algorithm to identify and isolate particle tracks that describe non-analytical trajectories. For curved tracks, the vertex detection recall was 86% and the precision 94%. For straight tracks, the vertex detection recall was 96% and the precision 98%. In the case of a test set containing only straight linear tracks, the algorithm performed better than an iterative Hough transform. Keywords: Time Projection Chambers, Active Target, Pattern Recognition, Clustering 1. Introduction One of the present aims of modern low-energy nuclear physics is to provide a more complete understanding about the behavior of subatomic matter under large isospin (i.e.
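The idea of clustering point triplets rather than raw points can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the triplet construction (each point with its two nearest neighbours), the distance measure in `triplet_distance`, and every parameter value are assumptions made for the example. Cutting a single-link hierarchy at a distance threshold is equivalent to taking connected components of the "closer than threshold" graph, so a union-find structure is used in place of a full dendrogram.

```python
import math
from itertools import combinations


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_triplets(points, collinearity_tol=0.05):
    """One triplet per point, built from its two nearest neighbours.

    Keeps only roughly collinear triplets and returns a list of
    (midpoint, unit direction, centre index). Assumes distinct points.
    """
    triplets = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        j, k = sorted(others, key=lambda n: math.dist(p, points[n]))[:2]
        a, b = points[j], points[k]
        u, v = _sub(a, p), _sub(b, p)
        cos = abs(_dot(u, v)) / (math.sqrt(_dot(u, u)) * math.sqrt(_dot(v, v)))
        if 1.0 - cos > collinearity_tol:
            continue  # neighbours do not line up with the centre point
        d = _sub(b, a)
        length = math.sqrt(_dot(d, d))
        direction = tuple(x / length for x in d)
        midpoint = tuple((x + y + z) / 3.0 for x, y, z in zip(a, p, b))
        triplets.append((midpoint, direction, i))
    return triplets


def triplet_distance(t1, t2, angle_weight=1.0):
    """Hypothetical distance: line-to-midpoint offset plus an angle penalty."""
    (m1, d1, _), (m2, d2, _) = t1, t2

    def perpendicular(m, d, q):
        w = _sub(q, m)
        along = _dot(w, d)
        return math.sqrt(max(_dot(w, w) - along * along, 0.0))

    offset = max(perpendicular(m1, d1, m2), perpendicular(m2, d2, m1))
    return offset + angle_weight * (1.0 - abs(_dot(d1, d2)))


def single_link_clusters(items, distance, threshold):
    """Labels from a single-link hierarchy cut at `threshold` (union-find)."""
    parent = list(range(len(items)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(items)), 2):
        if distance(items[i], items[j]) <= threshold:
            parent[find(i)] = find(j)
    roots = [find(i) for i in range(len(items))]
    relabel = {r: n for n, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]


# Two parallel straight tracks, 10 units apart (synthetic, noise-free).
points = [(float(x), 0.0, 0.0) for x in range(10)]
points += [(float(x), 10.0, 0.0) for x in range(10)]

triplets = make_triplets(points)
labels = single_link_clusters(triplets, triplet_distance, threshold=2.0)
print(labels)
```

Because each triplet carries a local direction, triplets on the same track stay close under this distance even when they are far apart along the track, which is what lets single-link chaining follow a trajectory without a parametric shape model.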

algorithm, artificial intelligence, machine learning, (18 more...)

1807.03513

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Brooks County (0.04)
North America > United States > Michigan > Ingham County > Lansing (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Energy (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)

Automating automation: Machine learning behind the curtain

#artificialintelligence, Feb-7-2017, 11:36:47 GMT

Robotic process automation (RPA) can be the true antidote to manual, rote work, or it can be our worst nightmare if you listen to all the drama or the hype. RPA centers on the use of artificial intelligence (AI) to apply human-like thinking to streamline a typically manually intensive process or activity; and whether we like it or not, it's here to stay. Take, for instance, the process of data extraction from documents such as invoices. Application of advanced optical character recognition (OCR) and intelligent document recognition can automate a significant amount of the job of data entry typically performed by clerks or specialized data entry staff. Interestingly, human effort is still involved with attaining the ability to hand off a process or task to a machine.

artificial intelligence, machine learning, optical character recognition, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.56)
